On Orderings of Probability Vectors and Unsupervised Performance Estimation
Unsupervised performance estimation, or evaluating how well models perform on
unlabeled data, is a difficult task. Recently, a method was proposed by Garg et
al. [2022] which performs much better than previous methods. Their method
relies on having a score function, satisfying certain properties, to map
probability vectors output by the classifier to the reals, but it is an open
problem which score function is best. We explore this problem by first showing
that their method fundamentally relies on the ordering induced by this score
function. Thus, under monotone transformations of score functions, their method
yields the same estimate. Next, we show that in the binary classification
setting, nearly all common score functions - the $L^\infty$ norm; the $L^2$
norm; negative entropy; and the $L^2$, $L^1$, and Jensen-Shannon distances to
the uniform vector - induce the same ordering over probability vectors.
However, this does not hold for higher dimensional settings. We conduct
numerous experiments on well-known NLP data sets and rigorously explore the
performance of different score functions. We conclude that the $L^\infty$ norm
is the most appropriate.
Comment: IJCAI 2023 Workshop on Generalizing from Limited Resources in the
Open Worl
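The ordering claim in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the two norm-based scores are the max-probability norm and the Euclidean norm, and all function and variable names are hypothetical. It ranks a few binary probability vectors under several score functions, then shows one three-class pair on which two of the scores disagree.

```python
import math

# Illustrative score functions mapping a probability vector to a real number.
# All names here are hypothetical; they are not taken from the paper.
def linf_norm(v):
    return max(v)

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def neg_entropy(v):
    # 0 * log(0) is treated as 0
    return sum(x * math.log(x) for x in v if x > 0)

def l1_to_uniform(v):
    u = 1.0 / len(v)
    return sum(abs(x - u) for x in v)

def l2_to_uniform(v):
    u = 1.0 / len(v)
    return math.sqrt(sum((x - u) ** 2 for x in v))

SCORES = {
    "linf_norm": linf_norm,
    "l2_norm": l2_norm,
    "neg_entropy": neg_entropy,
    "l1_to_uniform": l1_to_uniform,
    "l2_to_uniform": l2_to_uniform,
}

def ranking(score, vectors):
    # Indices of the vectors sorted by ascending score.
    return sorted(range(len(vectors)), key=lambda i: score(vectors[i]))

# Binary case: values of p chosen so that |p - 0.5| is distinct (no ties).
binary = [(p, 1.0 - p) for p in (0.05, 0.2, 0.35, 0.6, 0.9, 0.99)]
binary_rankings = {name: ranking(f, binary) for name, f in SCORES.items()}
same_binary_order = len({tuple(r) for r in binary_rankings.values()}) == 1

# Three-class case: one pair on which two scores order the vectors differently.
a = (0.5, 0.5, 0.0)
b = (0.55, 0.225, 0.225)
disagree_3d = linf_norm(a) < linf_norm(b) and l2_norm(a) > l2_norm(b)

print(same_binary_order, disagree_3d)
```

In the binary case every score above is an increasing function of |p - 0.5|, which is why the rankings coincide. With three classes that single degree of freedom is gone, so the scores can rank the same pair of vectors differently.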
STILN: A Novel Spatial-Temporal Information Learning Network for EEG-based Emotion Recognition
The spatial correlations and the temporal contexts are indispensable in
Electroencephalogram (EEG)-based emotion recognition. However, the learning of
complex spatial correlations among several channels is a challenging problem.
Besides, learning temporal contexts helps emphasize the critical
EEG frames, because subjects reach the expected emotion during only part
of the stimuli. Hence, we propose a novel Spatial-Temporal Information Learning
Network (STILN) to extract the discriminative features by capturing the spatial
correlations and temporal contexts. Specifically, the generated 2D power
topographic maps capture the dependencies among electrodes, and they are fed to
the CNN-based spatial feature extraction network. Furthermore, Convolutional
Block Attention Module (CBAM) recalibrates the weights of power topographic
maps to emphasize the crucial brain regions and frequency bands. Meanwhile,
Batch Normalizations (BNs) and Instance Normalizations (INs) are appropriately
combined to relieve the individual differences. In the temporal contexts
learning, we adopt a Bidirectional Long Short-Term Memory (Bi-LSTM)
network to capture the dependencies among the EEG frames. To validate the
effectiveness of the proposed method, subject-independent experiments are
conducted on the public DEAP dataset. The proposed method achieves
accuracies of 0.6831 and 0.6752 for arousal and valence classification,
respectively.
On Achievable Rates of Line Networks with Generalized Batched Network Coding
To better understand the wireless network design with a large number of hops,
we investigate a line network formed by general discrete memoryless channels
(DMCs), which may not be identical. Our focus lies on Generalized Batched
Network Coding (GBNC) that encompasses most existing schemes as special cases
and achieves the min-cut upper bounds as the parameters batch size and inner
blocklength tend to infinity. The inner blocklength of GBNC provides upper
bounds on the required latency and buffer size at intermediate network nodes.
By employing a bottleneck status technique, we derive new upper bounds on the
achievable rates of GBNCs. These bounds surpass the min-cut bound for large
network lengths when the inner blocklength and batch size are small. For line
networks of canonical channels, certain upper bounds hold even with relaxed
inner blocklength constraints. Additionally, we employ a channel reduction
technique to generalize the existing achievability results for line networks
with identical DMCs to networks with non-identical DMCs. For line networks with
packet erasure channels, we refine both the upper bound and the
coding scheme, and show through numerical evaluations that they are close.
Comment: This paper was presented in part at ISIT 2019 and 2020, and is
accepted by a JSAC special issu
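As a point of reference for the min-cut bound mentioned in the abstract above, here is a minimal sketch for a line network of packet erasure channels. It assumes the standard facts that an erasure channel with erasure probability eps has capacity 1 - eps packets per channel use, and that the min-cut of a line network is the smallest hop capacity. The function names are hypothetical, and the sketch does not model GBNC or its finite-blocklength bounds.

```python
def erasure_capacity(eps):
    # Capacity (packets per channel use) of a packet erasure channel
    # with erasure probability eps.
    if not 0.0 <= eps <= 1.0:
        raise ValueError("erasure probability must lie in [0, 1]")
    return 1.0 - eps

def line_network_min_cut(erasure_probs):
    # The min-cut of a line network is the capacity of its weakest hop.
    if not erasure_probs:
        raise ValueError("the network needs at least one hop")
    return min(erasure_capacity(eps) for eps in erasure_probs)

# Three hops with non-identical erasure probabilities (illustrative values).
hops = [0.1, 0.3, 0.2]
bound = line_network_min_cut(hops)
print(bound)
```

The bound depends only on the weakest hop and not on the number of hops, which is the quantity the abstract's new upper bounds improve on when the inner blocklength and batch size are small.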
iLoRE: Dynamic Graph Representation with Instant Long-term Modeling and Re-occurrence Preservation
Continuous-time dynamic graph modeling is a crucial task for many real-world
applications, such as financial risk management and fraud detection. Though
existing dynamic graph modeling methods have achieved satisfactory results,
they still suffer from three key limitations, hindering their scalability and
further applicability. i) Indiscriminate updating. Existing methods process
all incoming edges indiscriminately, which may increase time
consumption and introduce unexpected noise. ii) Ineffective node-wise
long-term modeling. They heavily rely on recurrent neural networks (RNNs) as a
backbone, which has been demonstrated to be incapable of fully capturing
node-wise long-term dependencies in event sequences. iii) Neglect of
re-occurrence patterns. Dynamic graphs involve the repeated occurrence of
neighbors, which indicates their importance but is neglected by
existing methods. In this paper, we present iLoRE, a novel dynamic graph
modeling method with instant node-wise Long-term modeling and Re-occurrence
preservation. To overcome the indiscriminate updating issue, we introduce the
Adaptive Short-term Updater module, which automatically discards useless
or noisy edges, ensuring iLoRE's effectiveness and instant ability. We further
propose the Long-term Updater to realize more effective node-wise long-term
modeling, where we innovatively propose the Identity Attention mechanism to
empower a Transformer-based updater, bypassing the limited effectiveness of
typical RNN-dominated designs. Finally, the crucial re-occurrence patterns are
also encoded into a graph module for informative representation learning, which
will further improve the expressiveness of our method. Our experimental results
on real-world datasets demonstrate the effectiveness of our iLoRE for dynamic
graph modeling.